Add report generation for OSU Benchmark #807
Conversation
📝 Walkthrough

Adds OSU benchmark parsing and CSV emission per run, a report-generation strategy, a comparison report producing tables/charts, registers and exports the new report/strategy, and extends tests to cover parsing, exports, and registry state.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 passed
Greptile Summary

This PR implements comprehensive report generation for OSU benchmarks, parsing stdout into CSV format and creating comparison reports with tables and charts.
Confidence Score: 4/5
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@src/cloudai/workloads/osu_bench/osu_comparison_report.py`:
- Around lines 65-142: The code reads each group's CSVs twice because create_tables and create_charts each call extract_data_as_df independently. Add a simple per-report cache: implement a helper (e.g., _get_group_dfs) that checks/sets a dict on self (e.g., self._dfs_cache) keyed by a stable identifier for the group (like tuple(item.tr.path) or id(group)) and returns the list produced by [self.extract_data_as_df(item.tr) for item in group.items]. Replace the direct list comprehensions in create_tables and create_charts with calls to _get_group_dfs, and clear/init self._dfs_cache appropriately (e.g., in the report constructor or before processing) so extract_data_as_df is invoked only once per group.
In `@src/cloudai/workloads/osu_bench/report_generation_strategy.py`:
- Around lines 64-85: The two branches in _parse_data_row that handle BenchmarkType.BANDWIDTH and the LATENCY fallback both return [parts[0], parts[1]]. Simplify by removing the explicit BANDWIDTH branch and collapsing them into a single fallback that returns [parts[0], parts[1]] for any non-MULTIPLE_BANDWIDTH case (after validating parts and the message-size int), keeping the MULTIPLE_BANDWIDTH branch unchanged; this reduces duplication while preserving behavior tied to _columns_for_type.
In
`@tests/report_generation_strategy/test_osu_bench_report_generation_strategy.py`:
- Around line 63-112: Add three edge-case tests for extract_osu_bench_data to
cover its early-return branches: (1) "file not found" — call
extract_osu_bench_data with a non-existent Path and assert it returns an empty
DataFrame (no columns, shape (0,0) or however empty is represented in your
project), (2) "empty file" — create an empty stdout file and assert
extract_osu_bench_data returns an empty DataFrame, and (3) "unrecognized header"
— write a stdout file containing non-OSU output (e.g., "osu_hello" text) and
assert extract_osu_bench_data returns an empty DataFrame; reference the function
extract_osu_bench_data and the module osu_bench.py so tests exercise the
special-case handling added there.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_test_scenario.py (1)
528-556: 🧹 Nitpick | 🔵 Trivial

Count updated correctly, but OSU bench mapping is not tested in test_custom_reporters. The total count of 18 accounts for the new OSUBenchTestDefinition registration. However, the test_custom_reporters parametrize list (lines 531–553) only has 16 entries and doesn't include OSUBenchTestDefinition → {OSUBenchReportGenerationStrategy}. Consider adding it for completeness.

♻️ Suggested addition to parametrize
Add the import at the top of the file:

```python
from cloudai.workloads.osu_bench import OSUBenchReportGenerationStrategy, OSUBenchTestDefinition
```

Then add to the parametrize list:

```diff
     (AiconfiguratorTestDefinition, {AiconfiguratorReportGenerationStrategy}),
+    (OSUBenchTestDefinition, {OSUBenchReportGenerationStrategy}),
```
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/cloudai/workloads/osu_bench/osu_comparison_report.py`:
- Around lines 71-148: The create_tables and create_charts methods repeat the same check-and-append logic for three metrics. Introduce a class-level metric descriptor, e.g. _METRICS = (("avg_lat", "Latency", "Time (us)"), ("mb_sec", "Bandwidth", "Bandwidth (MB/s)"), ("messages_sec", "Message Rate", "Messages/s")), and refactor both methods to loop over _METRICS: for each (col, title, y_label), build dfs via extract_data_as_df, call self._has_metric(dfs, col), and then call self.create_table(group, dfs=dfs, title=title, info_columns=list(self.INFO_COLUMNS), data_columns=[col]) in create_tables and self.create_chart(group, dfs, title, list(self.INFO_COLUMNS), [col], y_label) in create_charts. Keep references to _has_metric, create_table, create_chart, and INFO_COLUMNS so behavior is unchanged.
- Around lines 52-64: In extract_data_as_df, avoid calling df["size"].astype(int) directly, because non-numeric or NaN values will raise IntCastingNaNError. Instead, coerce the column to numeric and drop invalid rows before casting: use pd.to_numeric(df["size"], errors="coerce") (or the lazy.pd equivalent) to convert, call dropna() on that column (or drop rows where it is NaN), then safely cast to int (or to a nullable integer dtype) and return the cleaned df; reference df, csv_path, TestRun, and the extract_data_as_df method when making the change.
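A sketch of the _METRICS loop described in the first comment above. The data plumbing (plain dicts standing in for DataFrames, lists collecting the outputs) is simplified scaffolding; the descriptor tuple and method names follow the comment:

```python
# Sketch: a class-level metric descriptor driving both table and chart
# creation, so the check-and-append logic exists once per method.
class OSUComparisonReport:
    INFO_COLUMNS = ("size",)
    _METRICS = (
        ("avg_lat", "Latency", "Time (us)"),
        ("mb_sec", "Bandwidth", "Bandwidth (MB/s)"),
        ("messages_sec", "Message Rate", "Messages/s"),
    )

    def __init__(self, dfs_by_group):
        self._dfs_by_group = dfs_by_group  # stand-in for extract_data_as_df
        self.tables, self.charts = [], []

    def _has_metric(self, dfs, col):
        # A metric is present if any run's data contains its column.
        return any(col in df for df in dfs)

    def create_table(self, group, dfs, title, info_columns, data_columns):
        self.tables.append((group, title, data_columns))

    def create_chart(self, group, dfs, title, info_columns, data_columns, y_label):
        self.charts.append((group, title, y_label))

    def create_tables(self, group):
        dfs = self._dfs_by_group[group]
        for col, title, _y_label in self._METRICS:
            if self._has_metric(dfs, col):
                self.create_table(group, dfs, title, list(self.INFO_COLUMNS), [col])

    def create_charts(self, group):
        dfs = self._dfs_by_group[group]
        for col, title, y_label in self._METRICS:
            if self._has_metric(dfs, col):
                self.create_chart(group, dfs, title, list(self.INFO_COLUMNS), [col], y_label)
```

Adding a fourth metric then becomes a one-line change to _METRICS.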
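The numeric-coercion fix from the second comment above could look like this sketch, assuming a plain pandas import rather than the project's lazy.pd wrapper and showing only the size-column cleanup (the CSV read itself is elided):

```python
# Coerce "size" to numeric, drop unparseable rows, then cast safely,
# avoiding the IntCastingNaNError a bare .astype(int) would raise.
import pandas as pd


def clean_size_column(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["size"] = pd.to_numeric(df["size"], errors="coerce")  # bad values -> NaN
    df = df.dropna(subset=["size"])  # discard rows with invalid sizes
    df["size"] = df["size"].astype(int)  # now safe: no NaN remains
    return df
```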
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@allkoow there's a merge conflict, please resolve

I misclicked!! sorry. at least it resolved automatically
Summary
osu_init, osu_hello).

Test Plan
Tested on internal environment:
osu_latency, osu_bw, osu_get_bw, osu_put_bw, osu_bibw, osu_mbw_mr, osu_multi_lat.

osu_bw, osu_latency.